
feat(deploy): accept env_vars on initial POST + document claim flow#4

Merged
mastermanas805 merged 1 commit into master from feat/deploy-env-on-post on May 11, 2026
@mastermanas805
Member

Summary

1. env_vars on initial POST

`POST /deploy/new` now accepts an optional multipart `env_vars` field — a JSON object `{KEY:"value", ...}` merged into the deployed pod's env on the first build. Replaces the (POST /deploy/new) → wait → (PATCH /env) → (POST /redeploy) round-trip that doubled time-to-live-URL for any app that actually needed env config.

vault://KEY refs resolve at deploy time exactly as in the PATCH flow. Reserved underscore-prefixed keys are silently dropped.
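
The parsing rule described above can be sketched roughly as follows. This is a minimal illustration, not the handler's actual code: the function name `parseEnvVars` and the wrapped error text are assumptions, and `vault://` resolution is deliberately left out because it happens later, at deploy time.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// parseEnvVars decodes the raw "env_vars" form field (a JSON object of
// string values) and drops reserved underscore-prefixed keys.
// Hypothetical helper: the real handler's name and error type may differ.
func parseEnvVars(raw string) (map[string]string, error) {
	if strings.TrimSpace(raw) == "" {
		return map[string]string{}, nil // the field is optional
	}
	var in map[string]string
	if err := json.Unmarshal([]byte(raw), &in); err != nil {
		return nil, fmt.Errorf("invalid_env_vars: %w", err)
	}
	out := make(map[string]string, len(in))
	for k, v := range in {
		if strings.HasPrefix(k, "_") {
			continue // reserved for internal markers such as _name
		}
		out[k] = v // vault://KEY values pass through untouched here
	}
	return out, nil
}

func main() {
	env, err := parseEnvVars(`{"PORT":"8080","_name":"x","DB_URL":"vault://DB_URL"}`)
	fmt.Println(len(env), err)
}
```

Dropping reserved keys silently, rather than rejecting the request, keeps the first deploy from failing over a key the caller has no use for anyway.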

2. Document the claim flow in OpenAPI

`bearerAuth` now describes the full agent path: anonymous provision → `/claim` with email → session JWT → `/api/v1/api-keys` for unattended use. Without this, an agent reading `/openapi.json` had no signal where the JWT comes from.

Test plan

  • `TestOpenAPISpecParses` — catches any future raw-string-literal escape bug
  • `TestOpenAPI_DeployRequestHasEnvVars` — guards env_vars in the schema
  • `TestOpenAPI_BearerAuthDocumentsClaimFlow` — guards the auth-flow description
  • `go build ./...` passes
  • Integration test for the handler path (auth + bad env_vars JSON → 400): follow-up; needs the DB harness

🤖 Generated with Claude Code

Two contract additions friction-tested 2026-05-11:

1. POST /deploy/new now accepts an optional multipart "env_vars" field —
   a JSON object {KEY:"value", ...} merged into the deployed pod's env
   on the first build. Replaces the previous
     POST /deploy/new → wait → PATCH /env → POST /redeploy
   round-trip pattern that doubled time-to-live-URL for any app that
   actually needed env config (which is all of them). vault://KEY refs
   resolve at deploy time exactly as in the PATCH flow.

   Reserved underscore-prefixed keys are silently dropped to avoid
   collisions with internal markers (e.g. _name).

2. OpenAPI bearerAuth scheme now documents the full agent auth path:
   call any anonymous provisioning endpoint → /claim with email →
   session JWT → POST /api/v1/api-keys for unattended use. Without this,
   an agent reading /openapi.json had no machine-readable signal where
   the JWT comes from, and got 401s with no recovery hint.

Adds DeployRequest.env_vars to the schema with usage guidance, plus a
note on the current ~1 MiB build-context cap (the k8s Secret limit; the
form claims 50 MB).

Tests:
- TestOpenAPISpecParses guards against another raw-string-literal escape
  bug like the one introduced + fixed mid-PR.
- TestOpenAPI_DeployRequestHasEnvVars guards the env_vars contract.
- TestOpenAPI_BearerAuthDocumentsClaimFlow guards that the auth-flow
  description stays present (mentions /claim, anonymous, api-keys).
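
A guard of the kind TestOpenAPI_BearerAuthDocumentsClaimFlow describes might look roughly like this. It is a sketch under assumptions: the helper name `missingAuthHints` is invented, the sample spec in `main` is illustrative text rather than the real description, and the lookup uses the standard OpenAPI `components.securitySchemes` location.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// missingAuthHints parses an OpenAPI document and reports which of the
// expected hints the bearerAuth description fails to mention.
// Hypothetical helper, not the repository's test code.
func missingAuthHints(spec []byte) ([]string, error) {
	var doc struct {
		Components struct {
			SecuritySchemes map[string]struct {
				Description string `json:"description"`
			} `json:"securitySchemes"`
		} `json:"components"`
	}
	if err := json.Unmarshal(spec, &doc); err != nil {
		return nil, fmt.Errorf("spec does not parse: %w", err)
	}
	desc := strings.ToLower(doc.Components.SecuritySchemes["bearerAuth"].Description)
	var missing []string
	for _, hint := range []string{"/claim", "anonymous", "api-keys"} {
		if !strings.Contains(desc, hint) {
			missing = append(missing, hint)
		}
	}
	return missing, nil
}

func main() {
	// Illustrative spec fragment only.
	spec := []byte(`{"components":{"securitySchemes":{"bearerAuth":{
		"type":"http","scheme":"bearer",
		"description":"Anonymous provision, then /claim, then POST /api/v1/api-keys."}}}}`)
	missing, err := missingAuthHints(spec)
	fmt.Println(missing, err)
}
```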

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@mastermanas805 mastermanas805 merged commit ea76df9 into master May 11, 2026
@mastermanas805 mastermanas805 deleted the feat/deploy-env-on-post branch May 11, 2026 08:14
mastermanas805 added a commit that referenced this pull request May 11, 2026
#10)

Earlier PRs (#4, #6, #9) shipped OpenAPI schema tests but lacked
handler-level behavior tests. Code reviewers flagged the gap — a
schema test catches "field documented" regressions but misses "field
actually emitted" or "input actually parsed" regressions.

This PR backfills behavior tests for four shipped behaviors:

  1. upgradeNote / limitExceededNote copy (PR #9, friction #13)
     TestUpgradeNote_DoesNotMentionTrial         — 2 sub-cases
     TestLimitExceededNote_DoesNotMentionTrial   — 4 sub-cases
     Guards: no "14-day trial" framing, contains "Claim to keep" +
     "$9/mo", no instant.dev/start leakage.

  2. POST /api/v1/whoami (PR #6, friction #9)
     TestWhoami_NoTokenReturns401          — 401 on missing bearer
     TestWhoami_ReturnsIdentityForAuthedRequest
       — 200 with uid/tid claims; plan_tier enrichment when DB hit
     Test app now wires /api/v1/whoami so this and future tests can
     hit it through the full RequireAuth middleware.

  3. POST /deploy/new env_vars JSON parsing (PR #4, friction #11)
     TestDeployNew_EnvVarsJSON_Parsed_Into_InitEnv
       — valid JSON merges into deployment.EnvVars; underscore-prefixed
         keys silently stripped (_secret never leaks)
     TestDeployNew_EnvVarsInvalidJSON_Returns400
       — malformed JSON returns 400 error="invalid_env_vars"
         (not a generic 500)
     Includes a multipartDeployBody helper that other deploy tests
     can reuse without colliding with stack_test.go's name.

  4. upgrade_jwt in provisioning responses (PR #9, friction #16)
     TestAnonymousProvisionEmitsUpgradeJWT_OnDedup
       — dedup response includes raw upgrade_jwt JWT (no parsing)
         alongside the legacy upgrade URL; the two presentations of
         the same token must not drift.
     Skips cleanly when local test DB schema lags (env column).

All 10 new test cases pass against postgres:16-alpine + redis:7-alpine.
Total run time <1s.
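
The difference between a schema test and a behavior test (item 3 above) can be shown with a small handler-level check. This is a sketch only: `deployNew` is a stand-in written against net/http, not the project's handler or router, and the 200 success status is an assumption.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"mime/multipart"
	"net/http"
	"net/http/httptest"
)

// deployNew is a stand-in handler: it rejects a malformed env_vars field
// with 400 and error "invalid_env_vars" instead of a generic 500.
func deployNew(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	var env map[string]string
	if raw := r.FormValue("env_vars"); raw != "" {
		if err := json.Unmarshal([]byte(raw), &env); err != nil {
			w.WriteHeader(http.StatusBadRequest)
			json.NewEncoder(w).Encode(map[string]any{"ok": false, "error": "invalid_env_vars"})
			return
		}
	}
	w.WriteHeader(http.StatusOK) // success status is an assumption
	json.NewEncoder(w).Encode(map[string]any{"ok": true})
}

// postEnvVars sends raw as the multipart env_vars field and returns the
// response status plus the error code from the JSON body, if any.
func postEnvVars(raw string) (int, string) {
	body := &bytes.Buffer{}
	mw := multipart.NewWriter(body)
	_ = mw.WriteField("env_vars", raw)
	_ = mw.Close()
	req := httptest.NewRequest(http.MethodPost, "/deploy/new", body)
	req.Header.Set("Content-Type", mw.FormDataContentType())
	rec := httptest.NewRecorder()
	deployNew(rec, req)
	var resp struct {
		Error string `json:"error"`
	}
	_ = json.NewDecoder(rec.Body).Decode(&resp)
	return rec.Code, resp.Error
}

func main() {
	fmt.Println(postEnvVars(`{"A":"1"}`))
	fmt.Println(postEnvVars(`{"A":`))
}
```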

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mastermanas805 added a commit that referenced this pull request May 14, 2026
Agents that hit api.instanode.dev first looking for discovery (the convention
is to fetch /llms.txt before any other request) hit a 404 — only apex
instanode.dev currently serves it. Forward the request rather than
duplicating content so marketing stays the single source of truth.

Surfaced by Persona1 (anon AI agent) v6.2.2 retest finding #4.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mastermanas805 added a commit that referenced this pull request May 17, 2026
…ning

Round-3 P2 remediation across deploy/stack/webhook/auth surfaces.

1. deploy.go Redeploy now rejects a deployment in a terminal status
   (expired/deleted/stopped) with 409 + error code
   `deployment_not_redeployable` — redeploying one would resurrect an
   over-TTL/over-cap workload. New models.IsDeploymentTerminal +
   DeployStatusStopped const.

2. stack.go Redeploy re-runs the per-tier deployments_apps cap check when
   the stack is NOT in an active (slot-occupying) status — a failed/stopped
   stack flipping back to `building` could take a team to cap+1. New
   models.IsStackActive.

3. stack.go empty-env vault fallback changed from "production" to
   models.EnvDefault (development) at both new + redeploy sites — convention
   #11: a no-env legacy stack must not silently read production secrets.

4. deploy_teardown_reconciler.go increments a new
   metrics.DeployTeardownMarkFailed counter when MarkDeploymentTornDown
   fails — a persistently stuck row is now alertable in NR, not a silent log.

5. auth.go findOrCreateUserGitHub now matches an existing account by email
   (GetUserByEmail) and links github_id via new models.LinkGitHubID instead
   of forking a new team/user — mirrors findOrCreateUserGoogle and rejects
   takeover of an account already linked to a different GitHub ID. The
   /user/emails fallback now filters on Verified && Primary.

6. (already correct) models.CreateUser already routes email through
   NormalizeEmail at the write boundary — every OAuth/magic-link/claim call
   site is covered. No change needed; verified.

7. webhook.go receive_url is now built from a fixed server-controlled base
   (new webhookReceiveBaseURL: API_PUBLIC_URL / compiled-in base; c.BaseURL()
   only as a non-production dev fallback) instead of the client-controllable
   Host header. The URL is encrypted + persisted, so a client-settable host
   was a persistence-poisoning vector.

8. webhook.go Receive + ListRequests reject any non-webhook resource token
   with 404 — GetResourceByToken selects by token only, so a postgres/redis
   token previously passed.

9. auth.go GoogleAuthURL drops the impossible url.Parse-error 500 branch
   (the argument is a compile-time constant) — matches GoogleStart.
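
As a rough picture of item 1, the terminal-status guard could look like the sketch below. The status strings come from the list above; the constant names other than DeployStatusStopped, and the `redeployGuard` helper, are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"net/http"
)

// Status values named in the commit message. Only DeployStatusStopped is
// confirmed as a constant name; the other two names are assumed.
const (
	DeployStatusExpired = "expired"
	DeployStatusDeleted = "deleted"
	DeployStatusStopped = "stopped"
)

// IsDeploymentTerminal reports whether a deployment can no longer be
// redeployed without resurrecting an over-TTL or over-cap workload.
func IsDeploymentTerminal(status string) bool {
	switch status {
	case DeployStatusExpired, DeployStatusDeleted, DeployStatusStopped:
		return true
	}
	return false
}

// redeployGuard is a hypothetical wrapper: it returns the HTTP status and
// error code for a refused redeploy, or ok=true when redeploy may proceed.
func redeployGuard(status string) (code int, errCode string, ok bool) {
	if IsDeploymentTerminal(status) {
		return http.StatusConflict, "deployment_not_redeployable", false
	}
	return 0, "", true
}

func main() {
	for _, s := range []string{"building", "stopped", "expired"} {
		code, errCode, ok := redeployGuard(s)
		fmt.Println(s, code, errCode, ok)
	}
}
```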

Regression tests: models/redeploy_guard_test.go (IsDeploymentTerminal,
IsStackActive), models/link_github_id_test.go (LinkGitHubID),
handlers/webhook_receive_base_url_test.go (#7), handlers/p2_roundup_test.go
(#1 error code, #4 metric, #9 GoogleAuthURL), and a wrong-resource-type case
appended to handlers/webhook_test.go (#8).
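
The `Verified && Primary` filter from item 5 can be sketched against the shape of GitHub's /user/emails response. The helper name `pickVerifiedPrimary` is an assumption, and the sketch covers only the selection step, not the account linking.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// githubEmail mirrors the fields of a GitHub /user/emails entry that
// matter for the filter.
type githubEmail struct {
	Email    string `json:"email"`
	Primary  bool   `json:"primary"`
	Verified bool   `json:"verified"`
}

// pickVerifiedPrimary returns the address that is both verified and
// primary. Hypothetical helper; an unverified or secondary address is
// never used to match an existing account.
func pickVerifiedPrimary(payload []byte) (string, bool) {
	var emails []githubEmail
	if err := json.Unmarshal(payload, &emails); err != nil {
		return "", false
	}
	for _, e := range emails {
		if e.Verified && e.Primary {
			return e.Email, true
		}
	}
	return "", false
}

func main() {
	payload := []byte(`[
		{"email":"old@example.com","primary":false,"verified":true},
		{"email":"me@example.com","primary":true,"verified":true}
	]`)
	fmt.Println(pickVerifiedPrimary(payload))
}
```

Requiring both flags matters because matching on an unverified address would let anyone who adds a victim's email to a GitHub account link into the victim's team.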

go build ./... and go vet ./... pass. New no-DB regression tests pass; the
DB/Redis-backed suites require a test Postgres/Redis (unavailable in this
environment).
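
One possible reading of item 7 is sketched below. The precedence order (configured API_PUBLIC_URL first, request-derived base only outside production, compiled-in base otherwise) is an assumption, as are the function signature and the placeholder base URL; the commit message does not spell these out.

```go
package main

import (
	"fmt"
	"strings"
)

// compiledBaseURL is a placeholder for the compiled-in base.
const compiledBaseURL = "https://api.example.invalid"

// webhookReceiveBaseURL picks a server-controlled base for receive_url.
// requestBase stands for the Host-derived value (c.BaseURL() in the
// commit message) and is only trusted outside production.
func webhookReceiveBaseURL(apiPublicURL, requestBase string, production bool) string {
	if v := strings.TrimSpace(apiPublicURL); v != "" {
		return strings.TrimRight(v, "/")
	}
	if !production && requestBase != "" {
		return strings.TrimRight(requestBase, "/") // dev-only fallback
	}
	return compiledBaseURL
}

func main() {
	// In production a spoofed Host header has no effect on the result.
	fmt.Println(webhookReceiveBaseURL("", "http://attacker.example", true))
	fmt.Println(webhookReceiveBaseURL("", "http://localhost:8080", false))
}
```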

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mastermanas805 added a commit that referenced this pull request May 20, 2026
…/B18

Closes a batch of P2/P3 envelope-contract and ordering issues identified by
the B9 (provisioning), B10 (auth/ratelimit/quota), and B18 (input fuzz)
bug-bash reports. All fixes carry inline CLAUDE.md rule-17 coverage notes.

Fixes:

1. **B18 M4** — `POST /storage/:token/presign` validated the body before checking
   token existence, so a random UUID returned `invalid_operation` (400)
   before the existence check fired. Reordered: token parse → resource
   lookup → body-shape validation. Closes an information-flow risk if validators
   ever loosen.
   File: internal/handlers/storage_presign.go

2. **B18 M2** — Remove silent 120-byte truncation in sanitizeName. The
   authoritative length bound was already requireName's 64-rune gate; the
   second silent cap created a latent footgun if the name regex ever
   loosens to allow multi-byte runes. Updated regression test for the
   single-gate contract.
   Files: internal/handlers/provision_helper.go, provision_helper_test.go

3. **B18 M3** — Document the intentional UUID-shape-before-auth ordering on
   `GET /api/v1/webhooks/:token/requests`. The webhook token is a
   public-by-design capability (lands in HTTP headers/logs/outbound URLs);
   "well-formed-but-unknown" is not an oracle leak. Doc-only comment so
   future refactors preserve the intent.
   File: internal/handlers/webhook.go

4. **B18 L1** — Surface `X-Instant-Notice: name_normalized` header when
   sanitizeName mutates the request name (CRLF / tab / NUL / HTML-special
   chars stripped). Pre-fix the mutation was silent — agents looking up
   "db_for_user\n" later by exact name would never find the persisted
   "db_for_user". Header-only signal; does NOT fail the request (the
   strip is a deliberate hardening on top of the regex).
   File: internal/handlers/provision_helper.go

5. **B18 L2** — `parseProvisionBody` returns 415 `unsupported_media_type`
   when the request carries an explicit non-JSON Content-Type
   (application/xml etc.). Pre-fix, sending XML with `Content-Type:
   application/xml` returned 400 `name_required` — a misleading code that
   cost the caller one extra debugging cycle. The OpenAPI spec only
   declares `application/json`; 415 is the RFC-correct status.
   File: internal/handlers/provision_helper.go

6. **B10 P2-3** — Razorpay webhook invalid-signature envelope hydrated with
   the canonical ErrorResponse shape. Pre-fix, signature failures returned
   `{ok:false,error:"invalid_signature"}` with no request_id, message,
   retry_after_seconds, or agent_action. Razorpay support always asks for
   the request_id when a webhook fails. Same hydration applied to the
   invalid_payload path.
   File: internal/handlers/billing.go

7. **B10 P2-4** — Add `WWW-Authenticate: Bearer realm="instanode"` to every
   401 from respondUnauthorized. RFC 6750 §3 requires this header on every
   401 from a Bearer-protected resource. Pre-fix only the audience-mismatch
   path emitted it. OAuth-aware clients and HTTP debugging tools look for
   it.
   File: internal/middleware/auth.go
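
The Content-Type gate from fix 5 might be sketched as follows. The helper name `checkProvisionContentType` is hypothetical, and treating a missing header as acceptable is an inference from the word "explicit" above, not something the commit message states outright.

```go
package main

import (
	"fmt"
	"mime"
	"net/http"
	"strings"
)

// checkProvisionContentType returns 415 and "unsupported_media_type" when
// the request carries an explicit non-JSON Content-Type. A zero status
// means the body may be parsed. Hypothetical helper.
func checkProvisionContentType(header string) (int, string) {
	if strings.TrimSpace(header) == "" {
		return 0, "" // no explicit type: assumed to be still accepted
	}
	mediaType, _, err := mime.ParseMediaType(header)
	if err != nil || mediaType != "application/json" {
		return http.StatusUnsupportedMediaType, "unsupported_media_type"
	}
	return 0, ""
}

func main() {
	for _, h := range []string{"application/json; charset=utf-8", "application/xml", ""} {
		code, errCode := checkProvisionContentType(h)
		fmt.Printf("%q -> %d %q\n", h, code, errCode)
	}
}
```

Using `mime.ParseMediaType` rather than a string comparison keeps parameters such as `charset=utf-8` from tripping the check.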

Gate (matches CI/deploy.yml):
- `go build ./...` — green
- `go vet ./...` — green
- `go test ./... -short -count=1 -p 1` — green on every modified package;
  pre-existing failures (12 in handlers + 2 in models + 3 B13 contract
  tests) verified unchanged-against-master by stashing the patch and
  re-running the same suite. All pre-existing flakes documented in
  CLAUDE.md "Known Design Gaps".

Skipped (already shipped today, per brief):
- AESKeyring (a3155a5), B5/B11/B13/B7 (0c7991c), presign middleware (PR
  #122, not yet on master), 768c0ca's 8 fixes.

Coverage:
- B19 finding #1 (presign middleware) — already shipped in PR #122; this
  PR does not duplicate.
- B19 finding #4 (lease-recovery RTO) — worker-side, tracked as task #245.
- B9 P3-F8 (X-RateLimit-Remaining: 0 on success) — investigated; the math
  in rate_limit.go is correct (limit-count). The reported 0-on-success is
  not reproducible from the code path; left for in-prod re-probe after
  this lands.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mastermanas805 added a commit that referenced this pull request May 21, 2026
…/B18

Closes a batch of P2/P3 envelope-contract and ordering issues identified by
the B9 (provisioning), B10 (auth/ratelimit/quota), and B18 (input fuzz)
bug-bash reports. All fixes carry inline CLAUDE.md rule-17 coverage notes.

Fixes:

1. **B18 M4** — `POST /storage/:token/presign` validated the body before checking
   token existence, so a random UUID returned `invalid_operation` (400)
   before the existence check fired. Reordered: token parse → resource
   lookup → body-shape validation. Closes an information-flow risk if validators
   ever loosen.
   File: internal/handlers/storage_presign.go

2. **B18 M2** — Remove silent 120-byte truncation in sanitizeName. The
   authoritative length bound was already requireName's 64-rune gate; the
   second silent cap created a latent footgun if the name regex ever
   loosens to allow multi-byte runes. Updated regression test for the
   single-gate contract.
   Files: internal/handlers/provision_helper.go, provision_helper_test.go

3. **B18 M3** — Document the intentional UUID-shape-before-auth ordering on
   `GET /api/v1/webhooks/:token/requests`. The webhook token is a
   public-by-design capability (lands in HTTP headers/logs/outbound URLs);
   "well-formed-but-unknown" is not an oracle leak. Doc-only comment so
   future refactors preserve the intent.
   File: internal/handlers/webhook.go

4. **B18 L1** — Surface `X-Instant-Notice: name_normalized` header when
   sanitizeName mutates the request name (CRLF / tab / NUL / HTML-special
   chars stripped). Pre-fix the mutation was silent — agents looking up
   "db_for_user\n" later by exact name would never find the persisted
   "db_for_user". Header-only signal; does NOT fail the request (the
   strip is a deliberate hardening on top of the regex).
   File: internal/handlers/provision_helper.go

5. **B18 L2** — `parseProvisionBody` returns 415 `unsupported_media_type`
   when the request carries an explicit non-JSON Content-Type
   (application/xml etc.). Pre-fix, sending XML with `Content-Type:
   application/xml` returned 400 `name_required` — a misleading code that
   cost the caller one extra debugging cycle. The OpenAPI spec only
   declares `application/json`; 415 is the RFC-correct status.
   File: internal/handlers/provision_helper.go

6. **B10 P2-3** — Razorpay webhook invalid-signature envelope hydrated with
   the canonical ErrorResponse shape. Pre-fix, signature failures returned
   `{ok:false,error:"invalid_signature"}` with no request_id, message,
   retry_after_seconds, or agent_action. Razorpay support always asks for
   the request_id when a webhook fails. Same hydration applied to the
   invalid_payload path.
   File: internal/handlers/billing.go

7. **B10 P2-4** — Add `WWW-Authenticate: Bearer realm="instanode"` to every
   401 from respondUnauthorized. RFC 6750 §3 requires this header on every
   401 from a Bearer-protected resource. Pre-fix only the audience-mismatch
   path emitted it. OAuth-aware clients and HTTP debugging tools look for
   it.
   File: internal/middleware/auth.go

Gate (matches CI/deploy.yml):
- `go build ./...` — green
- `go vet ./...` — green
- `go test ./... -short -count=1 -p 1` — green on every modified package;
  pre-existing failures (12 in handlers + 2 in models + 3 B13 contract
  tests) verified unchanged-against-master by stashing the patch and
  re-running the same suite. All pre-existing flakes documented in
  CLAUDE.md "Known Design Gaps".

Skipped (already shipped today, per brief):
- AESKeyring (ed55c41), B5/B11/B13/B7 (ed14581), presign middleware (PR
  #122, not yet on master), f1ba49b's 8 fixes.

Coverage:
- B19 finding #1 (presign middleware) — already shipped in PR #122; this
  PR does not duplicate.
- B19 finding #4 (lease-recovery RTO) — worker-side, tracked as task #245.
- B9 P3-F8 (X-RateLimit-Remaining: 0 on success) — investigated; the math
  in rate_limit.go is correct (limit-count). The reported 0-on-success is
  not reproducible from the code path; left for in-prod re-probe after
  this lands.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>